FINAL REPORT


“Networked Guidance and Control for Mobile Multi-Agent Systems: A
Multi-Terminal (Network) Information Theoretic Approach”

AFOSR  Grant  FA95501110182 
Program  Manager:  Fariba  Fahroo,  Ph.D. 

1.  Outline  of  the  work  performed  and  main  results: 

(for  more  details  see  the  abstracts  in  Section  3  of  this  report) 

This  grant  sponsored  work  on  fundamental  aspects  of  the  following  research  themes: 

•  Analysis  and  design  of  optimal  distributed  estimation  and  control  systems  subject  to 
information  and  resource  constraints: 

o  We  have  pioneered  the  design  of  optimal  event-based  remote  estimation 
systems  for  different  types  of  information  constraints.  In  [2]  we  investigate  the 
case  when  multiple  sensor-estimator  pairs  must  share  a  resource-limited 
network,  while  in  [7]  we  formulate  and  solve  the  case  in  which  the 
communication from the sensors to the estimators is hindered by collisions. In
[6] we propose methods to design optimal remote estimation systems in which
information travels from the sensors to the estimators over channels that are
subject to energy harvesting mechanisms.

o In [8] we propose a new parametrization for the norm-optimal design of
decentralized controllers. In contrast with prior work, [8] builds on a
coordinate-free method that does not require a doubly-coprime factorization
of the plant and does not need the initial stabilizing controller to be
stable. The thesis [3] addresses aspects of optimal control in the presence
of secrecy constraints.

• Analysis and design of optimal feedback based on information-theoretic
principles:

o In [1], we use information-theoretic methods to obtain a convex program that
maximizes the number of recurrent states of an MDP. This has applications to
the design of policies that maximize the number of objects that can be
persistently surveilled by a mobile agent [5], subject to dynamic constraints
on the agent. Our formulation also allows for constraints on power and
safety.


2.  Transitions  to  DoD 

In  collaboration  with  ARL  and  NAVAIR,  we  are  exploring  applications  of  [6]  to  the  design  of 
remote  estimation  systems  that  include  a  human  operator.  The  operator  provides 
additional  expert  information  that  is  used  to  improve  the  performance  of  the  overall 
system.  The  main  idea  is  to  use  the  state-dependent  channel  proposed  in  [6]  to  model 
certain  human  behaviors,  such  as  bias  and  workload-related  loss  of  reliability. 

[1] Eduardo Arvelo and Nuno C. Martins, “Maximizing the set of recurrent states of an MDP
subject  to  convex  constraints,”  Automatica  50  (2014),  pp.  994-998 

Abstract: This paper focuses on the design of time-homogeneous fully observed Markov
decision processes (MDPs), with finite state and action spaces. The main objective is to
obtain policies that generate the maximal set of recurrent states, subject to convex
constraints on the set of invariant probability mass functions. We propose a design method
that relies on a finitely parametrized convex program inspired by principles of entropy
maximization. A numerical example is provided to illustrate these ideas.
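
To make the notion of a recurrent state concrete, the following minimal Python sketch classifies the states of a finite Markov chain, such as the chain induced by applying a fixed policy to an MDP. It only illustrates the quantity that [1] seeks to maximize and is not the convex program of [1]. The function name `recurrent_states` and the example transition matrix are assumptions introduced here for illustration.

```python
def recurrent_states(P):
    """Return the set of recurrent states of a finite Markov chain.

    P is a row-stochastic transition matrix given as a list of lists.
    """
    n = len(P)
    # reach[i][j] is True when state j can be reached from state i
    # in zero or more steps with positive probability.
    reach = [[i == j or P[i][j] > 0 for j in range(n)] for i in range(n)]
    for k in range(n):  # Warshall's transitive closure
        for i in range(n):
            if reach[i][k]:
                for j in range(n):
                    if reach[k][j]:
                        reach[i][j] = True
    # In a finite chain, state i is recurrent exactly when every state
    # reachable from i can also reach i back.
    return {
        i for i in range(n)
        if all(reach[j][i] for j in range(n) if reach[i][j])
    }


# Hypothetical 3-state chain: state 0 eventually leaks into the cycle {1, 2}.
P = [
    [0.5, 0.5, 0.0],
    [0.0, 0.0, 1.0],
    [0.0, 1.0, 0.0],
]
print(recurrent_states(P))
```

In this example, states 1 and 2 are reported as recurrent and state 0 as transient. A policy that enlarges this set would, in the surveillance setting of [5], let the agent revisit more locations indefinitely.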

[2]  Marcos  M.  Vasconcelos  and  Nuno  C.  Martins,  “Remote  Estimation  Games  over  Shared 
Networks,”  Proceedings  of  the  2014  Allerton  Conference  on  Communication,  Control,  and 
Computing  (page  numbers  not  yet  available) 

Abstract:  Consider  a  system  that  is  formed  by  two  sensors,  which  measure  a  random  variable 
each,  and  two  remote  estimators.  Each  estimator  is  tasked  to  produce  an  estimate  of  one  of  the 
variables  based  on  information  sent  to  it  by  its  corresponding  sensor.  We  propose  a  new  class  of 
problems  in  which  information  is  transmitted  from  the  sensors  to  the  estimators  via  a  collision 
channel  with  and  without  capture.  Our  results  characterize  the  structure  of  the  policies  that  are  in 
Nash  equilibrium  or  are  secure  in  a  well-defined  sense. 

[3]  Waseem  Malik,  Model  Based  Optimization  And  Design  Of  Secure  Systems,  Ph.D.  Thesis, 
University  of  Maryland  at  College  Park,  2013 

[4]  Aditya  Mahajan,  Nuno  C.  Martins,  Michael  C.  Rotkowitz  and  Serdar  Yuksel,  “Information 
Structures  in  Optimal  Decentralized  Control,”  2012  IEEE  CDC,  pp.  1291-1306 

Abstract:  This  paper  provides  a  comprehensive  characterization  of  information  structures  in  team 
decision problems and their impact on the tractability of team optimization. Solution methods
for team decision problems are presented in various settings, with the discussion organized
around two foci. The first is centered on solution methods for stochastic teams admitting
state-space formulations. The second is on norm-optimal control for linear plants under
information constraints.

[5]  Eduardo  Arvelo,  Eric  Kim  and  Nuno  C.  Martins,  “Maximal  Persistent  Surveillance  under 
Safety  Constraints,” ICRA  2013,  pp.  4048-4053 

Abstract: This paper presents a method for the design of time-invariant memoryless control
policies  for  robots  tasked  with  persistent  surveillance  of  an  area  in  which  there  are 
forbidden  regions.  We  model  each  robot  as  a  controlled  Markov  chain  whose  state  comprises  its 
position  on  a  finite  two-dimensional  lattice  and  the  direction  of  motion.  The  goal  is  to  find  the 
minimum  number  of  robots  and  an  associated  time-invariant  memoryless  control  policy  that 
guarantees  that  the  largest  number  of  states  is  persistently  surveilled  without  ever  visiting  a 
forbidden  state.  We  propose  a  design  method  that  relies  on  a  finitely  parametrized  convex 
program inspired by entropy maximization principles. For clarity of exposition, we focus on
simple dynamics and state/control spaces; however, the proposed methodology can be extended
to more general cases. Numerical examples are provided.


(submitted) 

[6]  David  Ward,  Nuno  C.  Martins  and  Brian  M.  Sadler,  “Optimal  Remote  Estimation  over  a 
Class  of  Action  Dependent  Switching  Channels,”  submitted  to  the  2015  American  Control 
Conference 

Abstract:  Consider  a  remote  estimation  system  formed  by  a  channel  and  an  encoder  that  assesses 
a  continuous  random  variable  denoted  as  source.  The  internal  structure  of  the  channel  has  a  finite 
state  machine  (FSM)  whose  state  dictates  the  transmission  characteristics.  Each  state  of  the  FSM 
corresponds  to  a  discrete  memoryless  channel  (DMC).  At  each  channel  use,  information  is 
transmitted  from  the  encoder  to  the  channel  output  according  to  the  DMC  selected  by  the  current 
FSM  state.  This  class  of  channels  is  denoted  as  Action  Dependent  Switching  Channel,  or  ADS. 
An  action  feedback  policy  maps  the  channel’s  output  into  the  input  of  the  FSM.  This  paper 
investigates  methods  to  analyze  and  design  an  action  feedback  policy  and  encoder  that  minimize 
the  differential  entropy  of  the  source  conditioned  on  the  channel  output.  We  show  that  there  are 
optimal  action  feedback  policies  for  which  the  input  to  the  FSM  is  a  deterministic  sequence  that 
does  not  depend  on  the  channel  output.  We  also  provide  additional  structural  results  for  the  case 
when  the  FSM  parametrizes  a  set  of  Binary  Symmetric  Channels  (BSC)  with  differing  crossover 
probabilities.  Here,  we  consider  that  the  ADS  contains  states  of  no  transmission,  which  are 
modeled  as  a  BSC  crossover  probability  of  one  half.  In  this  case,  the  FSM  is  also  used  to  model 
channel  degradation  as  a  result  of  multiple  transmissions,  and  it  also  allows  for  recovery  when 
there  are  no  transmissions.  Our  results  show  that  the  optimal  encoder  and  action  feedback  policies 
can  be  computed  separately.  We  also  discuss  the  relevance  of  this  model  to  applications  for  which 
the  encoder  is  powered  by  an  energy  harvesting  system,  and  also  when  the  channel  represents  a 
human  decision  maker  whose  reliability  and  bias  are  affected  by  current  and  past  outputs. 
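
The choice in [6] to model a no-transmission state as a BSC with crossover probability one half can be checked against the standard capacity formula for a BSC, C(p) = 1 - h(p), where h is the binary entropy function. The sketch below evaluates it in Python. Capacity is used here only as a familiar proxy for how informative the channel is; the criterion in [6] is the conditional differential entropy of the source, which this sketch does not compute. The function names are illustrative and do not come from [6].

```python
import math


def binary_entropy(p):
    """Binary entropy h(p) in bits, with h(0) = h(1) = 0 by convention."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def bsc_capacity(p):
    """Capacity, in bits per channel use, of a BSC with crossover probability p."""
    return 1.0 - binary_entropy(p)


# A crossover probability of one half makes the output independent of the
# input, so the capacity drops to zero: the channel use conveys nothing.
for p in (0.0, 0.1, 0.5):
    print(f"crossover {p:.1f}: capacity {bsc_capacity(p):.3f} bits")
```

An FSM state with crossover probability one half therefore behaves like a skipped transmission, which is consistent with using such states to represent recovery periods for a degraded channel or an energy-limited encoder.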

(Working papers to be submitted by the end of 2014)

[7] Marcos M. Vasconcelos and Nuno C. Martins, “Optimal Estimation over the Collision
Channel: The Static Case,” to be submitted to the IEEE Transactions on Automatic Control or
Automatica in 2014 - draft available upon request

Abstract:  Consider  a  system  that  comprises  a  remote  estimator  and  two  sensors  that  observe  a 
random  variable  each.  The  goal  of  the  remote  estimator  is  to  produce  estimates  of  the  random 
variables  based  on  information  that  is  transmitted  to  it  by  the  sensors.  The  random  variables  are 
independent  and  information  is  transferred  from  the  sensors  to  the  estimator  via  a  collision 
channel.  Each  sensor  has  the  authority  to  decide  what  and  whether  to  transmit,  and  simultaneous 
transmissions  result  in  a  collision  symbol  to  be  received  at  the  estimator.  In  our  formulation,  there 
is  no  communication  between  the  sensors,  which  precludes  the  use  of  coordinated  strategies.  Our 
results  characterize  the  structure  of  policies  at  the  sensors  and  the  remote  estimator  that  are 
optimal  with  respect  to  an  expected  mean  squared  error.  We  show  that,  when  an  optimum  exists, 
there  are  optimal  policies  at  the  sensors  that  use  deterministic  threshold  strategies  to  decide  when 
to transmit. In our analysis, we prove that the computation of a person-by-person optimal
threshold-based policy can be recast as a one-bit optimal quantization problem for which the
cost is non-uniform across representation symbols. We show the existence of such optimal
quantizers and we provide an iterative procedure akin to the Lloyd-Max algorithm that is
guaranteed to converge globally to a locally optimal solution. The iterative method converges
to an optimal solution in all numerical examples we have tried, one of which is discussed
here. We also present conditions that guarantee the existence of asymmetric optimal threshold
policies even when the overall framework is symmetric, such as when the random variables are
Gaussian zero mean with appropriately chosen variances.
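
As a rough illustration of how a pair of threshold policies can be scored, the sketch below computes the total mean squared error in closed form for two independent zero-mean Gaussian sources. It rests on assumptions that are not taken from [7]: sensor i transmits exactly when |X_i| exceeds a threshold tau_i that is symmetric about zero, a lone transmission is received without error, and the estimator outputs zero for a silent or collided sensor (the conditional mean under this symmetry). All function names are hypothetical, and this is neither the iterative design procedure nor the general policy class of [7].

```python
import math


def inner_energy(tau, sigma):
    """E[X^2 * 1{|X| <= tau}] for X ~ N(0, sigma^2)."""
    t = tau / sigma
    phi = math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)
    return sigma ** 2 * (math.erf(t / math.sqrt(2.0)) - 2.0 * t * phi)


def transmit_prob(tau, sigma):
    """P(|X| > tau) for X ~ N(0, sigma^2)."""
    return 1.0 - math.erf(tau / (sigma * math.sqrt(2.0)))


def collision_mse(tau1, tau2, sigma1=1.0, sigma2=1.0):
    """Sum of both sensors' mean squared errors over the collision channel."""
    inner1 = inner_energy(tau1, sigma1)   # error when sensor 1 stays silent
    inner2 = inner_energy(tau2, sigma2)   # error when sensor 2 stays silent
    outer1 = sigma1 ** 2 - inner1         # energy of X1 on {|X1| > tau1}
    outer2 = sigma2 ** 2 - inner2         # energy of X2 on {|X2| > tau2}
    # A transmitted value is lost only when the other sensor also transmits.
    return (inner1 + outer1 * transmit_prob(tau2, sigma2)
            + inner2 + outer2 * transmit_prob(tau1, sigma1))


print(collision_mse(0.0, 0.0))   # both sensors always transmit
print(collision_mse(50.0, 0.0))  # sensor 1 is effectively always silent
print(collision_mse(1.0, 1.0))   # both sensors use a moderate threshold
```

Under these assumptions, always transmitting from both sensors leaves the estimator with nothing but collisions, while silencing one sensor or thresholding both lowers the total error. A threshold search over `collision_mse` would be one simple way to explore the trade-off numerically.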

[8]  Serban  Sabau,  Nuno  C.  Martins  and  Michael  C.  Rotkowitz,  “A  convex  coordinate-free 
parameterization  subject  to  SQI  subspace  constraints,”  to  be  submitted  to  Automatica 

Abstract: This paper addresses the design of controllers, subject to given subspace constraints, for
finite-dimensional linear time-invariant plants. A controller is deemed admissible if it satisfies the
constraints.  Prior  results  introduced  an  algebraic  test,  denoted  as  quadratic  invariance  (QI),  that 
uses  the  subspace  constraints  imposed  on  the  controller  and  the  plant  to  determine  the  existence 
of  a  convex  parametrization  of  all  admissible  stabilizing  controllers.  If  such  a  parametrization 
exists  then  it  can  be  obtained  via  Youla’s  classical  method,  subject  to  additional  convex 
conditions on the Youla parameter. Here, we adopt the associated notion of strong quadratic
invariance (SQI), which is equivalent to QI in many cases of interest, such as when it is used to
express sparsity constraints on the controller. Under the assumption that the subspace constraints
are  SQI,  this  paper  introduces  a  new  parametrization  that  is  not  based  on  Youla’s  method  and  yet 
is  convex  and  also  allows  norm-based  optimal  control  formulations  to  be  cast  as  model  matching 
problems. It is based on the so-called coordinate-free approach, which was originally developed
in a centralized setting, and here is extended to the case in which the controllers are constrained to
an SQI subspace. The coordinate-free approach parametrizes the closed-loop map directly and it
relies  on  the  knowledge  of  an  initial  admissible  stabilizing  controller  that  does  not  need  to  be 
stable.  This  is  in  contrast  with  previous  approaches  that  either  require  an  initial  admissible  stable 
stabilizing  controller,  or  are  based  on  Youla’s  method  that  requires  a  doubly-coprime 
factorization  of  the  plant  that  must  satisfy  additional  conditions  that  depend  on  the  constraints. 